Learning to Interpret and Describe Abstract Scenes
Authors
Abstract
Given a (static) scene, a human can effortlessly describe what is going on (who is doing what to whom, how, and why). The process requires knowledge about the world, how it is perceived, and how it is described. In this paper we study the problem of interpreting and verbalizing visual information using abstract scenes created from collections of clip art images. We propose a model inspired by machine translation operating over a large parallel corpus of visual relations and linguistic descriptions. We demonstrate that this approach produces human-like scene descriptions which are both fluent and relevant, outperforming a number of competitive alternatives based on templates, sentence-based retrieval, and a multimodal neural language model.
Related resources
Mother-to-live experience of children with learning disabilities: a phenomenological study
The birth of a child is always accompanied by stress and anxiety for the mother, and if the child has problems, further emotional strain follows. Accordingly, the purpose of this study was to describe and interpret the lived experience of mothers of children with special learning disabilities. This research was conducted using a qualitative method of the phenomenological type...
Lived Clinical Learning Experiences of Medical Students: A Qualitative Approach
Introduction: Many studies have been conducted regarding the settings of clinical medical education and its problems, but the clinical learning experiences of medical students as a whole are less studied. The aim of this study was to explore, describe and interpret medical students' perceptions of clinical learning in order to obtain a deep insight into their clinical learning experience. Method...
Beyond the Turing Test
Scenes The visual question-answering task requires a variety of skills. The machine must be able to understand the image, interpret the question, and reason about the answer. For many researchers exploring AI, they may not be interested in exploring the low-level tasks involved with perception and computer vision. Many of the questions may even be impossible to solve given ...
Measuring Machine Intelligence Through Visual Question Answering
Scenes The visual question-answering task requires a variety of skills. The machine must be able to understand the image, interpret the question, and reason about the answer. For many researchers exploring AI, they may not be interested in exploring the low-level tasks involved with perception and computer vision. Many of the questions may even be impossible to solve given the current capabilit...
A Novel Approach in Video Scene Background Estimation
Abstract—This paper presents a novel method for background estimation in a video sequence from the function estimation point of view. The proposed algorithm, called Kernel-based Background Learning (KBL), is designed based on kernel machine joint with learning schemes. In order to estimate background using KBL algorithm, we first interpret foreground samples as outliers relative to the back...
Publication date: 2015